fix(cuda_std): use correct PTX scope suffix in atomic load/store #357

Merged
LegNeato merged 1 commit into Rust-GPU:main from Snehal-Reddy:fix/atomic-scope
Feb 26, 2026

Conversation

@Snehal-Reddy
Contributor

This PR fixes a bug in the load_scope! macro in crates/cuda_std/src/atomic/intrinsics.rs.

Previously, the macro interpolated the $scope identifier (e.g., device, block) into the PTX instruction string. PTX does not accept those names as scope qualifiers, so the macro emitted invalid instructions such as ld.relaxed.device.u32.

This change updates the macro to use $scope_asm (e.g., gpu, cta), ensuring valid PTX suffixes are generated (e.g., ld.relaxed.gpu.u32).
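
To make the difference concrete, here is a minimal sketch of the string-building step. It is not the actual load_scope! macro: the macro name `define_load_mnemonics`, the helper function `load_mnemonics`, and the fixed `relaxed`/`u32` parts are hypothetical and chosen only for illustration. It pairs each Rust-facing scope name with its PTX suffix and shows what each choice of metavariable produces.

```rust
// Hypothetical illustration only; not the real cuda_std macro.
// Each entry maps a Rust-facing scope name ($scope) to the PTX scope
// suffix ($scope_asm) that the assembler actually accepts.
macro_rules! define_load_mnemonics {
    ($($scope:ident => $scope_asm:ident),* $(,)?) => {
        /// Returns (scope name, mnemonic built from $scope, mnemonic built from $scope_asm).
        fn load_mnemonics() -> Vec<(&'static str, &'static str, &'static str)> {
            vec![$((
                stringify!($scope),
                // Old behavior: the Rust-facing name leaks into the instruction.
                concat!("ld.relaxed.", stringify!($scope), ".u32"),
                // Fixed behavior: the PTX suffix is used instead.
                concat!("ld.relaxed.", stringify!($scope_asm), ".u32"),
            )),*]
        }
    };
}

// Only the two pairs named in this PR description are listed here.
define_load_mnemonics! {
    device => gpu,
    block => cta,
}
```

Calling `load_mnemonics()` yields one tuple per scope, so the invalid and valid spellings can be compared side by side (for example, `ld.relaxed.device.u32` versus `ld.relaxed.gpu.u32`).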

fixes #354

@LegNeato LegNeato added this pull request to the merge queue Feb 26, 2026
@LegNeato
Contributor

Thanks!

Merged via the queue into Rust-GPU:main with commit b8ba547 Feb 26, 2026
18 checks passed

Development

Successfully merging this pull request may close these issues.

Invalid PTX generation for atomic load/store: incorrect scope suffix

2 participants